A Dependency Grammar for Amharic
Abstract
There has been little work on computational grammars for Amharic or other Ethio-Semitic languages and their use for parsing and generation. This paper introduces a grammar for a fragment of Amharic within the Extensible Dependency Grammar (XDG) framework of Debusmann. A language such as Amharic presents special challenges for the design of a dependency grammar because of the complex morphology and agreement constraints. The paper describes how a morphological analyzer for the language can be integrated into the grammar, introduces empty nodes as a solution to the problem of null subjects and objects, and extends the agreement principle of XDG in several ways to handle verb agreement with objects as well as subjects and the constraints governing relative clause verbs. It is shown that XDG’s multiple dimensions lend themselves to a new approach to relative clauses in the language. The introduced extensions to XDG are also applicable to other Ethio-Semitic languages.
Similar resources
Toward a Rule-Based System for English-Amharic Translation
We describe key aspects of an ongoing project to implement a rule-based English-to-Amharic and Amharic-to-English machine translation system within our L framework. L is based on Extensible Dependency Grammar (Debusmann, 2007), a multi-layered dependency grammar formalism that relies on constraint satisfaction for parsing and generation. In L, we extend XDG to multiple languages and translation...
An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies
A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...
Doubled Clitics are Pronouns: Amharic Objects (and Beyond)
Controversy and uncertainty have plagued the question of whether “object markers” (OMs) are object pronouns cliticized to the verb or realizations of object agreement. Using data from Amharic, we address this question from a new perspective. Specifically, we claim that Amharic OMs should be analyzed as clitics because they are unable to double nominals that are quantified, anaphoric, or contain...
A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features
Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and a hybrid of the two. Machine-learning-based methods perform well in the Persian language if they are trained with good features. To get good performanc...
Unsupervised Extraction of Verb Valency in Persian
Valency is the key concept in dependency grammar. Among all word categories, verbs are the most important, with a key role in syntax and semantics. The verb plays the central role in a sentence and acts as the main semantic component in dependency grammar. In this paper, after studying several methods for unsupervised discovery of Persian verb valency, the ambiguities are studied. Among ...
Publication date: 2010